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Learning and memory operations in neural circuits are believed to involve molecular 
cascades of synaptic and nonsynaptic changes that lead to a diverse repertoire of 
dynamical phenomena at higher levels of processing. Hebbian and homeostatic plasticity, 
neuromodulation, and intrinsic excitability all conspire to form and maintain memories. But 
it is still unclear how these seemingly redundant mechanisms could jointly orchestrate 
learning in a more unified system. To this end, a Hebbian learning rule for spiking neurons 
inspired by Bayesian statistics is proposed. In this model, synaptic weights and intrinsic 
currents are adapted on-line upon arrival of single spikes, which initiate a cascade of 
temporally interacting memory traces that locally estimate probabilities associated with 
relative neuronal activation levels. Trace dynamics enable synaptic learning to readily 
demonstrate a spike-timing dependence, stably return to a set-point over long time scales, 
and remain competitive despite this stability. Beyond unsupervised learning, linking the 
traces with an external plasticity-modulating signal enables spike-based reinforcement 
learning. At the postsynaptic neuron, the traces are represented by an activity-dependent 
ion channel that is shown to regulate the input received by a postsynaptic cell and generate 
intrinsic graded persistent firing levels. We show how spike-based Hebbian-Bayesian 
learning can be performed in a simulated inference task using integrate-and-fire (IAF) 
neurons that are Poisson-firing and background-driven, similar to the preferred regime of 
cortical neurons. Our results support the view that neurons can represent information in 
the form of probability distributions, and that probabilistic inference could be a functional 
by-product of coupled synaptic and nonsynaptic mechanisms operating over several 
timescales. The model provides a biophysical realization of Bayesian computation by 
reconciling several observed neural phenomena whose functional effects are only partially 
understood in concert. 



Keywords: Bayes' rule, synaptic plasticity and memory modeling, intrinsic excitability, naive Bayes classifier, 
spiking neural networks, Hebbian learning 



INTRODUCTION 

Bayesian inference provides an intuitive framework for how the 
nervous system could internalize uncertainty about the exter- 
nal environment by optimally combining prior knowledge with 
information accumulated during exposure to sensory evidence. 
Although probabilistic computation has received broad exper- 
imental support across psychophysical models describing the 
perceptual and motor behavior of humans (Wolpert and Kording, 
2004; Knill, 2005; Tassinari et al., 2006), it is nevertheless an 
open theoretical issue at which level of detail within the neu- 
ral substrate it should be embedded (Knill and Pouget, 2004). 
Furthermore, synthesizing a probabilistic perspective with exper- 
imental data is a decidedly non-trivial task (Doya et al., 2007). 
Realizations of disparate phenomena occurring within the corti- 
cal circuitry have been hypothesized to represent viable coding 
schemes for such Bayesian principles, including single neurons 
(Deneve, 2008a,b), neural population responses (Ma et al, 2006; 



Boerlin and Deneve, 2011), specifically within the parietal (Yang 
and Shadlen, 2007) and prefrontal (D'Acremont et al., 2013) cor- 
tices, activation levels in the visual cortical hierarchy (Carpenter 
and Williams, 1995; Rao and Ballard, 1999; Summerfield and 
Koechlin, 2008; Berkes et al., 2011), long-term synaptic plastic- 
ity (Soltani and Wang, 2009), and short-term synaptic plasticity 
(Pfister et al., 2010; Stevenson et al., 2010). However, inductive 
frameworks notoriously tend to impose restrictions about when 
learning should occur (if at all) and account for a fraction of 
the diversity in physiological processes whose given anatomical 
granularity is otherwise arbitrary. 

We propose a spike-based extension of the Bayesian 
Confidence Propagation Neural Network (BCPNN) plastic- 
ity rule (Lansner and Ekeberg, 1989; Lansner and Hoist, 1996) 
to address these issues. In this model, storage and retrieval are 
enabled by gathering statistics about neural input and output 
activity. Synaptic weights are effectively inferred using Bayes' rule 



Frontiers in Synaptic Neuroscience 



www.frontiersin.org 



April 2014 | Volume 6 | Article 8 | 1 



Tully et al. 



Spike-based probabilistic inference 



by incrementally (Sandberg et al., 2002) estimating confidence of 
feature observations from the input and posterior probabilities of 
outcome from the output. Weight modification depends on the 
temporal integration of spikes on different time scales using local 
synaptic traces, whose time courses are inspired by the cascade 
of events involved in the induction and maintenance of Hebbian 
plasticity. These traces estimate probabilities that determine 
synaptic weights and biases, which enable postsynaptic IAF 
neurons to signal through their relative spike rates the posterior 
likelihood of activation upon presentation of evidence in the 
form of presynaptic spiking. 

The model suggests a non-redundant role for the presence of 
and interaction between a range of different processes in approxi- 
mating probabilistic computation. Spike-based BCPNN can learn 
the temporal dimension of the input through modulation of its 
synaptic trace kinetics. Different spike timing-dependent plastic- 
ity (STDP) (Markram et al., 1997; Bi and Poo, 1998; Froemke 
and Dan, 2002) kernels can be predicted that promote learning 
forwards or backwards through time. Crucially, a unimodal sta- 
tionary distribution of synaptic weights naturally follows from 
the learning rule due to an inherent multiplicative decay of the 
weights over long time scales, generating convergence behavior 
that is functionally reminiscent of synaptic scaling (Turrigiano 
et al., 1998). A global neuromodulatory signal is shown to provide 
information about rewards or expected rewards (Florian, 2007). 
The bias term, which represents prior confidence pending input 
evidence, is recast here as a Ca 2+ sensitive, activity-dependent K + 
current whose functional outcome resembles long-term potenti- 
ation of intrinsic excitability (LTP-IE) (Cudmore and Turrigiano, 
2004). This interpretation allows us to replicate experiments 
from cortical neurons that suggested these factors could underlie 
graded persistent changes in firing levels (Egorov et al., 2002). 

Increased efforts have focused on identifying the interplay 
of multiple synaptic (Keck et al., 2012) and even nonsynaptic 
(Habenschuss et al, 2012; Nessler et al., 2013; Savin et al, 2014) 
empirically grounded phenomena that could be relevant for 
learning and inference. In spike-based BCPNN, the use of evolv- 
ing traces that coalesce to estimate probabilistic quantities com- 
plements these approaches by offering a conceivable way in which 
molecular events, which are known to span across different plas- 
ticity modalities (Daoudal and Debanne, 2003) and time scales 
(Tetzlaff et al., 2012), could be interconnected through latent 
probabilistic operations. The proposed model yields insights 
into how local and global computations, viewed through the 
lens of Bayes' rule, could accommodate a complex mixture of 
dynamics thought to be relevant for information processing in 
neocortex. 



value between 0 and 1. This corresponds to the probability of 
that event, given observed events, which are represented by other 
active units. In spike-based BCPNN, units are viewed as local 
populations of 30 spiking neurons (Peters and Yilmaz, 1993), i.e., 
minicolumns, that have similar receptive fields and are highly 
connected and coactive (Mountcastle, 1997; Yoshimura et al., 
2005; Bathellier et al., 2012). Corresponding levels of activa- 
tion for these minicolumns are represented by their average 
spike rate. 

Starting from Bayes' rule for relating the conditional probabil- 
ities of two random variables, observed firing rates collected from 
n presynaptic minicolumns Xi.„ n , i.e., the evidence P(xi...„), can 
better inform the firing probabilities of neurons in the postsynap- 
tic minicolumny ; , i.e., the prior P(yj): 



P(y,\xi... n ) = P(yj) 



P(Xl...n\yj) 
P(Xl...n) 



(1) 



The described learning approach is tantamount to a naive Bayes 
classifier that attempts to estimate the posterior probability dis- 
tribution P(yi 1 3Ci... n ) over a class (e.g., y* = "animal") realized by 
its observed attributes (e.g., Xh = "shape," "color," or "size"). By 
assuming conditional and unconditional independence between 
%!...„, Bayes' rule can be extended by: 



Piyj\xi...„) = P(yj) 



P(xi\yj) P(x 2 \yj) P(x n \ yj ) 



P(xi) P(x 2 ) 



P(Xn) 



(2) 



The assumption of independent marginals above is insignifi- 
cant considering that the denominator of Equation 2 is iden- 
tical for each yj. Thus, relative probabilistic ordering of classes 
remains intact, and probabilities can be recovered by normalizing 
P(yj\x\...n) to sum to 1. If we define each attribute X/, as a discrete 
coded or as an interval coded continuous variable (e.g., Xhi = 
"blue," "yellow," or "pink" for Xh = "color"), a modular network 
topology follows: 



H m 



P(yj\xi... n ) = P(yj) ]"[ I] 



■■ i ;= i 



P(xin\yj)_ 
P{x hl ) J 



(3) 



in which n/, minicolumns are distributed into each of H hyper- 
columns (Figure 1A). Here, n xu represents relative activity or 
uncertainty of the attribute value X/,;, and n Xhi = 1 indicates 
that attribute value Xhi was observed with maximal certainty. 
Equation 3 may instead be equivalently expressed as a sum of 
logarithms by: 



MATERIALS AND METHODS 

DERIVATION OF A PROBABILISTIC LEARNING RULE 

Theoretical underpinnings described in this section are not 
intended to be a novel contribution, but are briefly included 
for completeness (Lansner and Ekeberg, 1989; Lansner and 
Hoist, 1996). Consider a paradigm in which learning and recall 
are probabilistically grounded, associative memory mechanisms. 
According to BCPNN, computational units representing stochas- 
tic events have an associated activation state reflected by a real 



\ogP(y J \x 1 ... n ) = logPOj) + J2 log 



h=\ 



^ P(x hi \yj) 



(4) 



Equation 4 states that contributions via connections from mini- 
columns in the same hypercolumn need to be summed before tak- 
ing the logarithm, then summed again. Such an operation might 
be performed dendritically. More generally, the sum inside the 
logarithm can be approximated by one term through the elimina- 
tion of index h, since there are significantly more hypercolumns 
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FIGURE 1 | Reconciling neuronal and probabilistic spaces using the 
spike-based BCPNN architecture for a postsynaptic minicolumn with 
activity yi. (A) A cartoon of the derived network incorporates H = 5 
hypercolumns each containing n/, = 4 minicolumns that laterally inhibit each 
other (red lines) to perform a WTA operation via local inhibitory 
interneurons (red circles). The dotted gray area is represented by B in detail. 
(B) Weighted input rates N are summed and passed through a transfer 
function to determine the amount of output activation. Connections w Xl y, 
can be viewed as synaptic strengths (black lines, semicircles) or inverted 
directed acyclic graph edges representing the underlying generative model 
of a naive Bayes classifier. 



PROBABILISTIC INFERENCE PERFORMED WITH LOCAL SYNAPTIC 
TRACES 

Spike-based BCPNN is based on memory traces implemented 
as exponentially weighted moving averages (EWMAs) (Roberts, 
1959) of spikes, which were used to estimate P\, Pj, and as 
defined above (Equation 5). Temporal smoothing corresponds to 
integration of neural activity by molecular processes and enables 
manipulation of these traces; it is a technique commonly imple- 
mented in synapse (Kempter et al, 1999) and neuron (Gerstner, 
1995) models. EWMAs can ensure newly presented evidence 
is prioritized over previously learned patterns because as old 
memories decay, they are gradually replaced by more recent ones. 

The dynamics governing the differential equations of the 
learning rule with two input spike trains, S, from presynaptic 
neuron i and Sj from postsynaptic neuron are illustrated in 
Figure 2A. A three-stage EWMA procedure (Figures 2B-D) was 
adopted, the time constants of which were chosen to have a phe- 
nomenological mapping to key plasticity-relevant changes within 
signal transduction pathways that occur during learning. 

The Z, and Z; traces had the fastest dynamics (Figure 2B), and 
were defined as 



than incoming synapses per neuron in mammalian neocortical 
networks. Considering the asymptotically large size and sparse 
connectivity of these networks, it is statistically unlikely that a 
specific hypercolumn would receive more than one incoming 
connection from any other hypercolumn. 

Each hypercolumn is regarded as having normalized activity 
i = 1> anQl sucri canonical connectivity schemes along 
with the winner-take-all (WTA) operations they imply are preva- 
lent throughout neocortex (Douglas and Martin, 2004). Hence 
in analogy to neural transduction, a support value s ; = pj + 

12^= i ™xi w xiyi can be calculated by iterating over the set of possi- 
ble conditioning attribute values N = Hn^ ioryj with weight w Xiy . 
and bias ft update equations (Figure IB): 



Pj = log P(yj) w Xi 



log 



P(xj\yj) 

P(x t ) 



log 



P(xj, y,) 

P{xi)P(yj) 



(5) 



Activity statistics are gathered during learning and their rela- 
tive importance is evaluated and expressed as weights and biases. 
After Bayesian updating, probabilities are recovered by normaliz- 
ing P(yj\x\,.. n ) to sum to 1 over each y ; - by using an exponential 
transfer function since Sj = logP(y,|xi...„): 



P(yj\Xl.., n ) 



e 5 ' 



(6) 



It is important to note that from this point onward, we refer to w 
and p as models of the incoming synaptic strength and excitability 
of a neuron. In the case where multiple synaptic boutons from a 
pre- to postsynaptic target neuron exist, they are represented here 
as a single synapse. 



dZ; 



-Zi + s 



dZi 



*spike 



' dt f m 



■spike 



-Zj + S (7) 



which filtered pre- and postsynaptic activity with time constants 
T Zj , ~ 5-100 ms to match rapid Ca 2+ influx via NMDA recep- 
tors or voltage-gated Ca 2+ channels (Lisman, 1989; Bliss and 
Collingridge, 1993). These events initiate synaptic plasticity and 
can determine the time scale of the coincidence detection window 
for LTP induction (Markram et al, 1997). 

We assumed that each neuron could maximally fire at/ max Hz 
and minimally at e Hz, which represented absolute certainty and 
doubt about the evidential context of the input. Relative uncer- 
tainty was represented by firing levels between these bounds. 
Since every spike event had duration ms, normalizing each 
spike by / max t sp ik e meant that it contributed an appropriate pro- 
portion of overall probability in a given unit of time by making 
the underlying Z trace ~ 1 . This established a linear transforma- 
tion between probability space e {e, 1} and neuronal spike rate 
€ {e,/max}- Placing upper and lower bounds on firing rates was 
reasonable given physiologically relevant firing rates of cortical 
pyramidal neurons (Abeles, 1991). 

The Z traces were passed on to the E or eligibility traces (Klopf, 
1972), which evolved according to (Figure 2C): 



dEj 

'' dt 



Z, 



dEj 
1 dt 



■■Zi 



dEjj 
• dt 



ZiZj 



E,j (8) 



At this stage of the EWMAs, a separate equation was introduced to 
track coincident activity from the Z traces. Eligibility traces have 
been used extensively to simulate delayed reward paradigms in 
previous models (Florian, 2007; Izhikevich, 2007), and are viewed 
as a potential neural mechanism underlying reinforcement learn- 
ing (Pawlak et al., 2010). They enabled simultaneous pre-post 
spiking to trigger a buildup of activity in the E traces, which could 
then be eligible for externally driven neuromodulatory interven- 
tion. The time constant x e ~ 100-1000 ms was assumed to 
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FIGURE 2 | Schematic flow of BCPNN update equations reformulated 
as spike-based plasticity. (A) The S, pre- (A-D, red) and Sj postsynaptic 
(A-D, blue) neuron spike trains are presented as arbitrary example input 
patterns. Each subsequent row (B-D) corresponds to a single stage in the 
EWMA estimate of the terms used in the incremental Bayesian weight 



update. (B) Z traces low pass filter input spike trains with r Zj = t z ,. (C) E 
traces compute a low pass filtered representation of the Z traces at time 
scale x e . Co-activity now enters in a mutual trace (CD, black). (D) E traces 
feed into P traces that have the slowest plasticity and longest memory, 
which is established by t p . 



represent one of the downstream cellular processes that could 
interact with increased intracellular Ca 2+ concentrations, such as 
CaMKII activation (Fukunaga et al., 1993). Creation of a decaying 
tag for each pre-post activated synapse for delivery of a spe- 
cific marker that can be targeted for future plasticity-associated 
protein trafficking (Frey and Morris, 1997) has also been hypoth- 
esized to provide an intermediary step in the transition from early 
to late phase LTP. 

E traces were subsequently passed on to the P traces 
(Figure 2D). Gene expression, protein synthesis and protein cap- 
ture are cellular processes that mediate LTP maintenance and 
long-term memory formation (Nguyen et al., 1994; Frey and 
Morris, 1997). They tend to be activated in late phase LTP by ele- 
vated levels of Ca 2+ dependent protein kinases, akin to activation 
in the P trace dynamics originating from sustained activation in 
the E traces: 



x *lTt 



K(Ei-Pi) Tp 



dPj 
It 



-K(Ej-Pj) Tp- 



dP] 
dt 



k(E,,-P,,) (9) 



Since these processes tend to exhibit highly variable timescales 
lasting anywhere from several seconds up to potentially days or 
months (Abraham, 2003), we simply imposed r Zj , r z . < r e < r p , 
but typically used T p » 10 s for the sake of conciseness in simula- 
tions. Directly regulating the learning rate, parameter k e [0, oo] 
represented the action of an endogenous neuromodulator, e.g., 
dopamine (Schultz et al., 1997), that signaled the relevance of 
recent synaptic events. The P trace is considered a versatile process 
tied closely to the nature of the task at hand by a globally applied 



k (Schultz et al, 1997). Recently stored correlations were prop- 
agated when k 0 and no weight changes take place when k = 
0. Although we show through simulations how delayed reward 
could be implemented with E traces, they are not required for 
inference, and having x e approach 0 would not undermine any 
of the results presented here. 

Probabilities were ultimately fed into the final learning rule 
update equations (Equation 5) used to compute and wy: 



Pj = \og(Pj) Wij = log 



PiPj 



(10) 



To illustrate this process, a learning scheme involving delayed 
rewards is depicted with a pair of connected neurons (Figure 3A). 
In this example, a reward was delivered 1-2 s after coincident 
activity (Waelti et al, 2001) for 500 ms (Gonon, 1997) to rein- 
force deserving stimuli. If r e was too small or positive reward k 
arrived after the E trace had decayed to baseline (Figure 3B), no 
signal was propagated to the P traces. As a result, the correspond- 
ing Pji trace and weight remained unchanged. However, if the E 
trace was sufficiently large such that there was an overlap with 
k, the strength of the synapse grew and associative learning tran- 
spired (Figure 3C). Although only one connection Wy is depicted 
in this example, k would be modulated in the same way for all 
synapses in the network context, typical of dopaminergic neuron 
release characteristics (Waelti et al., 2001). 



Frontiers in Synaptic Neuroscience 



www.frontiersin.org 



April 2014 | Volume 6 | Article 8 | 4 



Tully et al. 



Spike-based probabilistic inference 




FIGURE 3 | Delayed reward learning using £ traces. (A) A pair of 
neurons fire randomly and elicit changes in the pre- (red) and 
postsynaptic (blue) Z traces of a BCPNN synapse connecting them. 
Sometimes by chance (pre before post*, synchronous+, post before 
pre*), the neurons fire coincidentally and the degree of overlap of their 
Z traces (inset, light blue), regardless of their order of firing, is 



propagated to the mutual eligibility trace En. (B) A reward (pink 
rectangular function, not to scale) is delivered as external supervision. 
Resulting E traces are indicated (gray line, r e = 100 ms and black line, 
i g = 1000ms). (C) Behavior of color corresponding Pn traces and 
weights (inset) depends on whether or not the reward reached the 
synapses in ample time. 



LEAKY INTEGRATE-AND-FIRE NEURON MODEL 

Model spikes are generated using NEST version 2 (Gewaltig and 
Deismann, 2007). An IAF neuron with alpha function-shaped 
postsynaptic conductance, NEST model "iaf_cond_alpha" (Kuhn 
et al., 2004), is amended to account for the bias term /j (Equation 
10). It enters the sub-threshold voltage V m equation of the post- 
synaptic neuron according to: 



C m~TT — Sl( V m - El) + 2^Sex, i(V m - E ex , ,) 
j=l 

n 

+ / 1 ginh, i(V m - E m h, ,) + (pip 



(ID 



When threshold V t h is reached (V m > Vtk) a spike is generated 
and V m is held to the reset potential V res for t re f ms represent- 
ing the absolute refractory period. The total current flow across 
the membrane is determined by the membrane capacitance C m , 
the leak reversal potential El, excitatory E ex and inhibitory E{„h 
reversal potentials, the leak conductance gi, excitatory g ex and 
inhibitory g n h synaptic conductances, and Ip that is scaled to rep- 
resent an activity-dependent current quantity by </>. Postsynaptic 
conductances g ex and g ln h are modified by the occurrence of an 
excitatory or inhibitory input event from one of the n presynaptic 
neurons at time t s by: 

t-t s -d i-c-fc-'O 

gex\,nh,i(t) = gmaxWy e r «l«> (12) 

tex\inh 

This enables g ex or g; n /, to rise with finite duration r ex or r;„/, to its 
peak conductance g max at time t — t s — d = r ex or r m h, where d is 
the transmission delay. 

IAF neurons offer an analytically convenient form for describ- 
ing rate of firing dependent upon quantifiable measures of V m . 
We will show in the Results that the input-output relationship 



in a background driven regime is particularly suited for Bayesian 
computations (Equation 6). If we consider an IAF neuron as it 
receives excitatory synaptic drive k ex = n ex { ex w ex T ex e from 
Poisson processes spiking at/ ex Hz with weights w ex = YTi= \ w ij> 
its mean firing rate r can be formulated according to Kuhn et al. 
(2004): 



r(/i- m , (7 m ) — 



1 - erf 



(13) 



where r m = C m /(gi + X ex ) is the effective membrane time con- 
stant, erf is the error function, and the steady state mean /x m and 
standard deviation a m of its V m are estimated by (Figure SI): 



ElGl + Eex^e: 



11exfex(2.T m I T ex ) 



2C m (t m + Tex) 



(14) 



In numerical simulations, neurons were stimulated by Poisson 
spike trains or correlated spike trains, the latter of which were 
generated using the Multiple Interaction Process (Kuhn et al., 
2003) defined in NEST ("mip_generator"). For simulations where 
background activity was present, 30 input Poisson sources stim- 
ulated each neuron to control their background spike rate. The 
values of all synaptic and neuronal parameters used in numerical 
simulations are listed in Table 1. 

RESULTS 

We found that dynamical phenomena emerging from this map- 
ping resembled processes that are thought to underlie learning 
and memory in cortical microcircuits. We first identify the synap- 
tic and nonsynaptic correlates of this extension by studying ensu- 
ing spike dynamics accompanying the individual assumptions of 
the derivation, and then the functionally distinct computations 
are considered together in a network setting where we demon- 
strate a simple Bayesian inference task performed by spiking 
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Table 1 | When parameters are not explicity listed in the text, they are 
interleaved below, following (Nordlie et al., 2009). 

(A) MODEL SUMMARY 



Table 1 | Continued 



Neuron model 


Leaky IAF 


Synapse model 


Conductance-based with cr-shaped PSCs, 
plastic BCPNN synapses 


Channel model 


K + channel 


Input model 


Fixed-rate Poisson spike trains 


Measured quantities 


Spike activity, connection strengths, biases, 
voltages 


(B) NEURON MODEL 


Leaky IAF dynamics 


Subthreshold membrane potential V m of 

neuron / with n inputs: 

-C m ^f = g L (V m - E L ) + £/Li g exJ (V m - 

E ex ,i) + E?=1 9inh, i(Vm ~ Ei nh _ i) + (pip 

Spiking: If V m > spike generated and V m 
held at V res for t re f ms 



Parameters 



(C) CHANNEL MODEL 

Activity-dependent 
hyperpolarizing 



Parameters 



C m = 250 pF membrane capacitance 
gi = 16.67 nS leak conductance 
El = —70 mV leak reversal potential 
E ex = 0 mV excitatory reversal potential 
E m h = —75 mV inhibitory reversal potential 
(j> = 50 pA current scaling factor Figures 8-10, 
OpA otherwise 

V t h = — 55 mV membrane voltage threshold 
V res = —60 mV membrane reset potential 
t re f = 2 ms refractory period 
dt = 0.1 ms time resolution 



K+/CAN current of neuron j, \p s pA: 
t p d 4= K (Ej-Pj) 

hi = <PPj = <P ^g(Pj) 

See Equation 8 for calculation of Et 

z z j = 10 ms Z trace time constant 
T e = 100 ms E trace time constant 
t p = 10000 ms P trace time constant 
Sj = 1 if spike, 0 if no spike 
fmax = 20 Hz, highest firing rate 
e = 1 /(fmaxip) Hz, lowest firing rate 



(D) SYNAPSE MODEL 

ff-shape PSC 
dynamics 



BCPNN synapse 



tspike = 0-1 ms spike duration 



Excitatory g ex and inhibitory g, n f, conductance 
changes for postsynaptic neuron j with spike 
at time f s by one of the n presynaptic neurons: 

gex\inh, /(f) = 9max^'" 



ex\inh 



Synaptic strength between / and /, w,y nS: 

z p d 4 = - Pi), = «(£; - Pj), r p ^ = 

^Eij-P;-) 

wr, = log (A) 

See Equation 8 for calculation of E; E; and E,j 



Parameters 



= 10 ms Z trace time constant 
: 100 ms E trace time constant 





t p = 10000 ms P trace time constant 




9max = 2.0 nS peak conductance 




r ex = 0.2 ms a rise time for excitatory input 




r; n /, — 2 ms a rise time for inhibitory input 




d = 0.1 ms transmission delay 




k = 1 .0, 0.0 to freeze plasticity (Figure 10) 


(E) INPUT 


Poisson generator 


n ex = 30 processes, independent per neuron 




w ex = 10.75 nS per process 




r ex rate, 0 < r ex < /max 


(F) MEASUREMENTS 



/Continued) 



r spike rate (spikes/second) 
Wjj synaptic weight between neurons / and / (nS) 
V m membrane voltage of neuron j (mV) 
/^bias current magnitude (pA) 



VALIDATING SPIKE-BASED BCPNN WITH PREVIOUS 
IMPLEMENTATIONS 

As a proof of concept, we first sought to validate whether using 
EWMAs with input Poisson trains in spike-based BCPNN could 
reliably estimate learning outcomes of an abstract BCPNN where 
units had simple, exponentially smoothed binary activation pat- 
terns (Equation 5) (Sandberg et al, 2002). To demonstrate con- 
sistency, five patterns between two units (binary activations of 1 
or 0) and two neurons (Poisson spike trains firing at/ max or e Hz) 
were instantiated in ten consecutive 200 ms trials. In this setup, we 
set Tp = 1000 ms by design to be less than this 2000 ms presented 
pattern duration. 

By simultaneously presenting proportional unit activity and 
spiking patterns to the pre- (Figure 4A) and postsynaptic 
(Figure 4B) binary output units of abstract BCPNN and IAF 
neurons of spike-based BCPNN, a close correspondence between 
their resulting weight and bias trajectories was confirmed 
(Figure 4C). Five separate cases were tested in order to robustly 
sample statistical relationships among a diverse set of patterns. 
Correlated patterns meant both units/neurons were maximally 
or minimally active/firing in each trial, independent patterns 
denoted uniform sampling of active and inactive patterns for 
both neurons in each trial, anti-correlated patterns meant one 
was active and the other was inactive or vice-versa in each 
trial, both muted meant both were inactive in all trials, and 
post muted meant activity of the presynaptic neuron was uni- 
formly sampled and the postsynaptic one was inactive in all 
trials. 

We found some notable differences between spike-based 
BCPNN and other correlation-based learning rules. Instances in 
which neuron i was highly active and neuron j weakly active (and 
vice versa) led to a decay of wu, which eventually turned neg- 
ative. When and j were both either highly or weakly active, 
Wij increased because correspondingly active or correspondingly 
inactive patterns are indistinguishable from a probabilistic view- 
point. The increase of Wy when i and j were both weakly active 
was linearly dependent upon the three exponentially decaying P 
traces (Equation 9), since they tended to decay toward e in the 
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FIGURE 4 | Spike-based BCPNN estimates abstract BCPNN for different 
input patterns. (A) Pre- and (B) postsynaptic input spike trains. Activation 
patterns (shaded rectangles) of abstract BCPNN units and corresponding 
Poisson spike trains (vertical bars) firing at f max Hz elicited in IAF neurons 
are differentiated by color. (C) Weight and bias (inset) development under 
different protocol for the abstract (dotted) and spike-based (solid) versions 
of the learning rule. Spiking simulations were repeated 100 times and 
averaged, with standard deviations illustrated by the shaded regions. 



absence of any input. When i and j were both highly active, learn- 
ing was virtually instantaneous, or one-shot, since Xp was short 
compared with the stimulus duration. Steady state trace dynamics 
were responsible for the eventual decay of positive weights over 
time, similar to the multiplicative enforcement of constraints pre- 
viously proposed on theoretical grounds (Miller and Mackay, 
1994). Importantly, this built-in compensatory mechanism was 
much slower than weight increases, otherwise its regulatory 
effects would have dampened any transient activity fluctuations 
that could have been relevant for information processing and 
memory. 

PLASTICITY DYNAMICS OF SPIKE-BASED BCPNN 

The spiking setup allowed us to consider more detailed tempo- 
ral aspects of plasticity beyond simple rate-modulated Poisson 
processes. First, we investigated how the temporal relationship 
between pre- and postsynaptic activity influenced expression of 
plasticity in our model. To evaluate the STDP properties of spike- 
based BCPNN, a canonical experimental protocol was simulated 
(Markram et al, 1997; Bi and Poo, 1998) by inducing pre- (f;) 
and postsynaptic (f ; ) spiking in IAF neurons shortly before or 
after one another 60 times at 1 Hz frequency without background 
activity (Figure 5A). 
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FIGURE 5 | STDP function curves are shaped by the Z trace time 
constants. (A) Schematic representation of the STDP conditioning 
protocol. Each pre (blue) — post (green) pairing is repeated for each time 
difference At — f| — ti illustrated in (C-E). (B) Weight dependence for 
positive (Af = 0 ms, solid line) and negative (Af = 50 ms, dashed line) spike 
timings. Compare to Figure 5 of Bi and Poo (1998). (C) Relative change in 
peak synaptic amplitude using r z / = 5 ms, r z y = 5 ms, x B = 100 ms, and 
Tp = 10000ms. This curve is reproduced in (D-F) using dotted lines as a 
reference. (D) The width of the LTP window is determined by the 
magnitude of the Z trace time constants. When t z; - is changed to 2 ms, the 
coincident learning window shifts right. (E) Instead when r z ; is changed to 
2 ms, it shifts left. Note that a decrease in t zi - is thus qualitatively consistent 
with the canonical STDP kernel. (F) Changing the P trace time constant 
influences the amount of LTD. When z p is doubled to 20,000 ms, the 
learned correlations tend to decay at a slower rate. 



The strength of the weight changes were bidirectional and 
weight-dependent (Figure 5B), generally exhibiting LTP for 
tight values of Af = f, — tj and LTD for wider values of Af 
(Figure 5C). The shape of the learning window was dependent 
upon the parameters r zl , r z j, and r p , defining the duration of 
the different memory traces in the model (see Materials and 
Methods). Manipulation of the Z trace time constants changed 
the width of the STDP window, and therefore r z ; and r z j effec- 
tively regulated sensitivity to spike coincidence. Having r z> r ZJ 
generated an asymmetric weight structure that allowed for pri- 
oritization of pre-post timing (+ Af) over post-pre timing (— Af, 
Figure 5D) and vice versa (Figure 5E). The LTD area shrank for 
a constant STDP window width when Tp was increased because 
it induced a longer decay time for the P traces (Figure 5F), 
emphasizing a slowness in learning. Temporally symmetrical 
Hebbian learning was due to an increase of Px as a result 
of the amount of overlap between P; and Pj (see Figure 2D). 
A similar form of LTP based on pre- and postsynaptic spike 
train overlap (Figure S2) has been shown for synapses in slices 
(Kobayashi and Poo, 2004). 
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FIGURE 6 | The BCPNN learning rule exhibits a stable equilibrium 
weight distribution. (A) Progression of averaged rates of firing (3 s bins) 
for the presynaptic (blue) and postsynaptic (black) neurons in the network. 
(B) Setup involves 1000 Poisson-firing presynaptic neurons that drive one 
postsynaptic cell. (C) The BCPNN synaptic strengths recorded every 
100 ms (blue, dotted white line is their instantaneous mean) has an initial 
transient but then remains steady throughout the entire simulation despite 
deviation amongst individual weights within the equilibrium distribution. (D) 
BCPNN weight histogram plotted for the final time epoch is unimodal and 
approximately normally distributed (blue line, /no = 0.0 and cto = 0.38). 



AN EMERGENT APPROACH TO THE STABILITY vs. COMPETITION 
DILEMMA 

Long-term stability can be problematic for correlative learning 
rules (e.g., Figure 5C), since bounded Hebbian synapses 
destabilize plastic networks by maximally potentiating or depress- 
ing synapses. Additional mechanisms such as weight-dependent 
weight changes (van Rossum et al, 2000) or fine tuning of 
window parameters (Kempter and Gerstner, 2001; Babadi and 
Abbott, 2010) have been shown to be able to keep weights 
in check. In contrast, owing to its plasticity dynamics during 
on-line probability estimation, spike-based BCPNN naturally 
demonstrated weight dependence (Figure 5B) along with a sta- 
ble unimodal equilibrium weight distribution when exposed to 
prolonged uncorrelated stimulation. 

We conducted equilibrium experiments (Figures 6, 7) using 
spike-based BCPNN synapses in which each of their mean sta- 
tionary weight distributions were shifted upwards by the lowest 
possible allowed weight. This subtrahend was calculated from 
Equation 10, log(e 2 /0.5 2 ) = log(4e 2 ), or the log minimum 
Py = e 2 (no co-activity) divided by maximum P,P ; = 0.5 2 (both 
pre- and post-neurons are active half of the time) trace values. 
Although this normalization would not occur biologically, it was 
necessary for displaying true equilibrium weight values because 
the average weight distribution ~ 0 after r p ms due to P trace 
decay, and zero-valued average weights would have mitigated any 
postsynaptic response in the absence of background input. To 
demonstrate stability, a postsynaptic neuron is shown steadily fir- 
ing at an average of 7 Hz when innervated by 1000 presynaptic 
input neurons each producing 5 Hz Poisson spike trains due to 
background activity (Figure 6A). Given this setup (Figure 6B), 
the evolution of the renormalized synaptic weights during this 
period settled around 0 (Figure 6C). 

This behavior can be understood by investigating the P traces. 
Initially, both P, and P; increased as presynaptic input elicited 
postsynaptic spiking, growing the value of the denominator from 
Equation 10. In the numerator, the mutual trace P{i built up as 
well, and there was an eventual convergence in the P traces to 
PjPj = Jp^j after an elapsed time tp. Because both neurons fired 
together, the learning rule initially enhanced their connection 
strength, creating an initial transient output rate excursion. But 
as input persisted such that pre- and postsynaptic neurons con- 
tinued firing at constant rates, correlations were eventually lost 
due to P trace decay. Statistically speaking, the signals emitted 
by the two neurons were indistinguishable over long timescales. 
The steady state of the weights ended up approximately Gaussian 
distributed around the quotient log(l) ~ 0 (Figure 6D), inde- 
pendent of the approximate rates for the pre- and postsynaptic 
neurons. This stability was robust to the choice of time constants, 
given relatively constant pre- and postsynaptic firing rates. 

But presence of a unimodal equilibrium weight distribu- 
tion alone does not guarantee competition amongst constituent 
weights. More functionally relevant is a situation where weight 
enhancement in one group of inputs causes a corresponding 
weight reduction among others (Gilson and Fukai, 2011). To 
illustrate competition within the spike-based BCPNN weight 
structure, we selectively introduced pairwise correlation into 
the spike timings of 100 presynaptic cells. The correlated and 



uncorrelated input groups were stimulated to fire at the same 
rate (Figure 7A), so that the only difference in signal between 
neurons of the feedforward network (Figure 7B) was on the 
spike-timing level. Evolution of the weights was recorded for each 
connection (Figure 7C), and a specialized weight structure devel- 
oped dependent upon the correlation coefficient C (Figure 7D). 
The difference between the distributions was calculated as the 
discriminability (Willshaw and Dayan, 1990): 

d , = M+-Mo (i5) 

The variable /x + represented the mean of the correlated distribu- 
tion, /zo the mean of the uncorrelated distribution, and a + ~ ctq 
the standard deviation shared by the two distributions. The equi- 
librium weight distribution shifted proportionally for differing 
amounts of C (Figure 7E). As expected from a competitive mech- 
anism, correlated neurons remained more potentiated beyond r p 
despite underlying long-term stabilizing pressures (see Figure 6). 
To assess the level of competition, we summed the synaptic 
weights for both the correlated and uncorrelated subpopulations 
for increasing C. As the weights stemming from the correlated 
population increased with C, the weights in the uncorrelated 
population decreased in response, while total weight values were 
kept relatively steady (Figure 7F). Furthermore, competition was 
reduced by increasing r zl - and x z ;, which decreased the standard 
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FIGURE 7 | A shift in the weight distribution of correlated neurons 
arises from structured input. (A) Progression of averaged rates of firing 
(3 s bins) for the externally stimulated uncorrelated (blue) and correlated 
(pictured C = 0.2, red) presynaptic neurons, along with the postsynaptic 
(black) neuron they drive in the network. (B) Setup involves 900 
uncorrelated and 100 correlated presynaptic neurons that drive one 
postsynaptic cell. (C) Synaptic strengths recorded every 100 ms from the 
correlated group gradually specialize over time vs. their uncorrelated 
counterparts, resulting in a change in the mean distribution of weights 
(white dotted lines for each, here C = 0.2). (D) Weight histograms plotted 
for the final time epoch are unimodal and approximately normally 
distributed (C = 0.2, n Q = -0.03, mo = 0.34 and tr 0 » "+ = 0.18). (E) The 
separation between these distributions is expressed as d 1 , which increases 
as a function of the input correlation coefficient. (F) Summed weights for 
the correlated (red), uncorrelated (blue), and combined (black) in the final 
epoch as a function of C. (G) Same as in (F) but with t z , and r zl increased 
by a factor of 2. In both instances, the combined weights remain relatively 
constant around Wg = 0, although lower time constants induce more 
substantial differences between the correlated and uncorrelated weights. 
Error bars depict the standard deviation gathered from 50 repeated trials. 



deviation of the terminal weight distribution and reduced the 
importance of each individual spike (Figure 7G). 

INTRINSIC GENERATION OF GRADED PERSISTENT ACTIVITY AS A 
FUNCTIONAL CONSEQUENCE OF p 

In spike-based BCPNN, output firing rates represent the poste- 
rior probability of observing a presented pattern. Although it is 
calculated by exponentiating the support activity (Equation 6), 
exponential input-output curves are rarely measured in experi- 
ments despite the apparent computational benefits of non-linear 



input transformation at the level of single neurons (Koch, 2004). 
To account for these biological constraints, an alternative sce- 
nario is considered in which a neuron is stimulated by excitatory 
Poisson background input such that the mean voltage of its 
membrane potential is subthreshold (Figure 8A) and it fires up 
to intermediate levels. This background-driven regime enables 
spike production due to fluctuations in subthreshold membrane 
voltage, and is thought to approximate in vivo conditions dur- 
ing which cortical neurons are bombarded by ongoing synaptic 
background activity (Destexhe et al., 2001). 

We found that linearly increasing the level of presynaptic 
drive in the presence of background activity caused an expansive 
non-linearity in the IAF input-output curve within a physio- 
logically relevant <1 up to 20 Hz range of firing, which has 
been reported previously for conductance-based IAF neurons 
(Fourcaud-Trocme et al, 2003) and cortical neurons (Rauch et al., 
2003). The time-averaged firing rate was well-approximated by 
an exponential function (Figure 8B). Relating back to Figure 1, 
information deemed relevant in the form of increased activity 
by a subset of presynaptic sources can cause the postsynaptic 
neuron to ascend its activation function. Inhibitory drive could 
dominate if other active presynaptic neurons signaled counter- 
evidence. Although they are excluded here, such interactions 
would not elicit a qualitative deviation in the input-output curve 
from Figure 8B. 

Although functional synaptic aspects have been emphasized 
up until this point, a distinct role for intrinsic plasticity was 
not precluded. The neural input-output relationship is controlled 
by the abundance, kinetics, and biochemical properties of ion 
channels present in the postsynaptic cell membrane. This is rep- 
resented in spike-based BCPNN by the variable Pj, which is 
a function of the prior probability of postsynaptic activity Pj 
(Equation 9, see Figure 8C), and quantifies a general level of 
excitability and spiking for the postsynaptic neuron. Because 
Pj log(e) for minimal and Pj — >• log(l) = 0 for maximal post- 
synaptic firing rates, Pj essentially lowered the probability for 
neurons that were seldom active previously to be driven passed 
threshold in the future. With regards to the statistical inference 
process, this excitability represents an a priori estimate of postsy- 
naptic activation. The intuition is if an event is experienced for the 
first time, it will still be highly unexpected. To account for these 
effects neurally, Pj was treated as a hyperpolarizing current, Ipj, 
that was continuously injected into the IAF neuron according to 
Equation 11. 

The outcome of this type of dynamic modification is illus- 
trated in Figure 8D. The input-output curve was shifted depend- 
ing on Pj, and the same synaptic input caused differing output 
levels. Similarly, LTP-IE provides a priming mechanism that can 
sensitively tune membrane properties of a neuron in response 
to altered levels of incoming synaptic input (Cudmore and 
Turrigiano, 2004). The A-type K + channel gates the outward flow 
of current in an activity-dependent manner prescribed by a loga- 
rithmic transformation of Pj (Hoffman et al., 1997; Daoudal and 
Debanne, 2003; Jung and Hoffman, 2009). The decay of a Ca 2+ - 
activated non-specific cationic (CAN) current mediated by acti- 
vation of transient receptor potential (TRP) channels (Petersson 
et al., 2011) is another candidate that is thought to play a role in 
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FIGURE 8 | Exponential activation function of a lowly firing IAF 
neuron is shifted by an injection of a hyperpolarizing current 
proportional to fy. (A) Voltage trace and resulting long tail distribution 
of a membrane potential histogram from an IAF neuron approaching 
firing threshold of — 55 mV (bin size = 0.15 mV). (B) The input-output 
curve of an IAF neuron with 30 inputs each firing at values listed along 
the abscissa (black, simulated; blue, see Materials and Methods for 
theoretical IAF rate). For low firing frequencies at or below 20 Hz, the 



these graded changes (Fransen et al, 2006). Mirroring the cascad- 
ing trace levels that collectively compute fij, multiple time scales 
of TRP current decay rate have been identified including a fast 
decay of 10 ms (Faber et al., 2006), a medium decay of 200-300 ms 
(Wyart et al., 2005) and a slow decay of 2-3 s (Sidiropoulou et al., 
2009). 

Intrinsic excitability has been conjectured to serve as a mem- 
ory substrate via locally stored information in the form of a 
neuron's activity history. Despite the lack of temporal specificity 
that exists for synapses, intrinsic effects provide an alternative 
computational device that is presumably beneficial for learn- 
ing and memory. We therefore asked how fij could account for 
functional aspects associated with the modulation of intrinsic 
excitability. 

Specifically, we sought to model the rapid changes in intrinsic 
excitability found in slice preparations of layer V neurons from 
entorhinal cortex in rat (Egorov et al., 2002). In this study, initially 
silent neurons were repeatedly depolarized leading to a graded 
increases in their persistent firing levels. It was also shown that 
persistent activity states were deactivated by applying hyperpolar- 
izing steps until quiescence. Figure 9A summarizes this stimulus 
protocol, which was applied to an /^-modulated IAF neuron in 
the presence of background excitation. Duration and magnitude 
of the transient events were parameterized according to Egorov 
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function is approximately exponential (red-dotted fit: 
y = 0.48e°- 29(x - 7 - 18) - 0.47). (C) The bias term shows logarithmically 
increasing firing rate values of the neuron for which it is computed. (D) 
When hyperpolarizing current proportional to Pj is applied, neurons that 
have previously been highly active will be more easily excitable (e.g., 
yellow curve) compared to neurons that have had little recent history of 
firing (e.g., blue curve). Error bars depict the standard deviation 
gathered from 50 repeated experiments. 



et al. (2002), using depolarizing steps of 0.3 nA for 4 s each and 
hyperpolarizing steps of 0.4 nA for 6 s each. The resulting activity 
of the neuron is illustrated by Figure 9B. Stable periods of ele- 
vated and suppressed firing rates were associated with increases 
and decreases in Ipi, respectively. To achieve quantitatively similar 
graded persistent firing levels as was shown in Egorov et al. a r p of 
60 s was used, similar to induction time courses observed for LTP- 
IE in neurons from visual cortex (Cudmore and Turrigiano, 2004) 
and cerebellar deep nuclear neurons (Aizenman and Linden, 
2000). The sustained levels of activation were noisier than the 
in vitro preparation of Egorov et al., presumably due to the 
presence of excitatory synaptic background activity in the model. 

Importantly, the increased rate of firing caused by each depo- 
larizing stimulus application period led to a continuum of levels 
up to / max Hz, rather than discretely coded activity levels (Fransen 
et al, 2006). The number of levels was arbitrary and depended on 
both the magnitude and duration of the pulse, displaying peak 
frequencies (<20 Hz) similar to those that were assumed for/ max . 
To test this, ten depolarizing 2 s current steps were induced, pro- 
ducing a continuum of levels that was approximately linear with 
a regression coefficient of 1.33 (Figure 9B inset, red dotted line). 
Discharges were sustained by changes in the Pj trace (Figure 9C). 
Each depolarizing step led to the generation of spikes which tran- 
siently increased Pj and made ft less negative. Conversely, each 
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FIGURE 9 | The bias term reproduces the graded persistent 
activity found in entorhinal cortical neurons. (A) Stimulation 
protocol. Repetitive depolarizing followed by hyperpolarizing current 
injections (switch occurs at black arrow) of the IAF neuron 
including fy. (B) Peristimulus time histogram (2 s bin width) of the 



elicited discharge. Red bars indicate the time averaged activity of 
each 1 min post-stimulus interval. Time averaged activity of 1 min 
post-stimulus intervals using 0.3 nA depolarizing steps each lasting 2 s 
(stars, red-dotted line: linear fit). (C) Underlying R trace evolution 
during the simulation. 



hyperpolarizing step tended to silence output activity, decreasing 
fij and making it more difficult for the neuron to reach thresh- 
old. A bidirectional effect of fij was apparent here, as excitability 
decreased when the neuron was depotentiated (Daoudal et al., 
2002). 

DEMONSTRATING PROBABILISTIC INFERENCE USING A SIMPLE 
NETWORK 

Up to this point, Wy and fij have been treated independently, but 
by virtue of a shared Pj, this is not always the case in terms of 
network dynamics. A low excitability fij for a historically inac- 
tive neuron would not necessarily detract from the informative 
content of the neuron per se, rather it must be considered in con- 
junction with its incoming weights Wij. It is entirely plausible that 
Wij would be very high. In terms of the inference task, this would 
amount to neurons representing one specific class. To recapitulate 
previous examples, the feature "pink" might only signal the class 
"animal" if a flamingo was part of the training set, since such a 
distinctive feature is statistically rare yet easily classifiable. 

Since neither weights Wy nor biases & alone were able to reli- 
ably predict the outcome of learning, we introduced a simple 
network model (Figure 10A) to show how interwoven synap- 
tic and nonsynaptic computations could perform a Bayesian 
inference task. Input layer minicolumns (X, Y) were all-to-all 
connected to the output layer (X', Y'), each consisting of 30 neu- 
rons. In order to implement WTA, output layer neurons were 
recurrently connected amongst themselves (connection proba- 
bility = 0.2), and reciprocally to an inhibitory population of 
10 neurons (connection probability = 0.5). Ten seconds of 
alternating, orthogonal Poisson stimulation patterns (i.e., / max 



or e Hz) were applied to input layer groups and identically to 
their corresponding output groups. Over the course of training, 
specialized weights w,j developed (Figure 10C) in which con- 
nections between X (Y) and X' (Y') increased in strength since 
they were coactive during training, and connections between 
X (Y) and Y' (X') decreased in strength since their activa- 
tions were temporally disjoint. Since both X' and Y' were active 
for half of the training, their Pi traces saturated at 0.5 (not 
shown). A simulation paradigm was employed in which weights 
were disabled during training (g max = 0) and frozen after learn- 
ing (k = 0) for the sake of simplicity and since such effects 
have been hypothesized to mimic neuromodulatory interactions 
(Hasselmo, 1993). 

The neurons weighed all available evidence and fired accord- 
ing to their inferred Bayesian weights and biases, and levels of 
uncertainty in these patterns were represented by neuronal fir- 
ing rates during recall (Figure 10B). Recall could be performed 
irrespective of the stimulus duration, which was simply chosen 
to match x p here. For the trivial cases in which output neurons 
were presented with the exact input stimulus pattern received 
during training (X&-Y, X&-Y), certainty was exhibited by firing 
rates approaching / max = 20 Hz. The reciprocal readout neurons 
in these scenarios had a lower level of belief in the incoming pat- 
terns due to the inhibition that developed between both sets of 
anti-correlated groups during training. In more interesting sce- 
narios, output layer neurons displayed intermediate firing rates 
when both input populations were active (X&Y) due to inhibi- 
tion of the novel pattern, and responded with uncertainty without 
any input pattern (-X&-Y), as their activity was dominated by 
fij in the absence of presynaptic input. WTA ensured that either 
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FIGURE 10 | Spiking BCPNN performs a simple Bayesian inference. (A) 

Network architecture with excitatory (black) and inhibitory (gray) 
connections between local minicolumns. Input neurons of groups X and Y 
each project to the output layer X' (green) and Y' (blue), which mutually 
inhibit each other via an inhibitory WTA population (gray). (B) Posterior 
probability distributions are reflected by the output rates of postsynaptic 
neuron pools X' and Y' (colors corresponding to A) in 1 s bins during recall. 
(C) Evolution of the mean weight matrix during training, where each cell 
represents the averaged activity for al! 900 connections. Three snapshots 
were taken during learning: one at the beginning, one a tenth of the way 
through, and one at the end of the simulation. Weights that were 
developed in alternating 200 ms intervals were initially volatile, but 
eventually settled into a symmetrical terminal weight structure. 



group X' or Y' temporarily won, or fired at / max Hz while their 
counterpart was silent. This meant in both cases (X&Y) and 

(-X&-Y), neurons tended to fire on average at Hz in this 
simple example. 

DISCUSSION 

That the brain could encode probabilistic models is a radi- 
cal departure from classical approaches in neuroscience, which 
assume a bottom-up mechanistic view of computational units 
as input filters. Nevertheless, given that both human behavior 
in psychophysical tasks (Wolpert and Kording, 2004; Knill, 2005; 
Tassinari et al., 2006) and recorded neural activity in different 
brain areas (Carpenter and Williams, 1995; Rao and Ballard, 
1999; Yang and Shadlen, 2007; Summerfield and Koechlin, 2008; 
Berkes et al., 2011; D'Acremont et al., 2013) have been shown 
to be able to carry out probabilistic operations, it has been 
suggested that a Bayesian coding hypothesis may be a generic 
property of neural computation. Models have been devised to 
show how Bayesian inference could be carried out by neurons 
and/or their networks, demonstrating various levels of neurobi- 
ological realism and capturing several general properties thought 
to be relevant for information processing. Here, we have recon- 
ciled several of these properties by showing that the extension 
of BCPNN to the domain of spiking neurons enables a rich col- 
lection of dynamics that collectively approximate probabilistic 
inference. 



INTERPRETATION OF POSITIVE AND NEGATIVE SYNAPTIC WEIGHTS IN 
THE MODEL 

Weights in the proposed model can switch between positive and 
negative values, such that an excitatory synapse may become 
inhibitory and vice-versa. A monosynaptic excitatory connec- 
tion with conductance determined by the positive component of 
Wij could exist in parallel with a disynaptic inhibitory connec- 
tion set by the negative component. Evidence for this putative 
feedforward inhibitory microcircuit has been shown to be associ- 
ated with postsynaptic spike rate (Mathews and Diamond, 2003; 
Mori et al., 2004) or interneuron bypassing (Ren et al, 2007). 
Upon observing evidence that does not support the a priori belief 
level, the efficacy of synaptic transmission to excitatory sources 
via inhibitory interneurons neurons would increase, indirectly 
creating a net inhibitory drive. A direct channel would be pre- 
ferred when the neuron is highly certain regarding the statistics 
of its input, so that the net effect would instead be excitatory. 
Since plastic weights turn negative, our model also implicitly 
assumes the presence of inhibitory plasticity (Kullmann et al., 
2012), which has been previously investigated in the context of 
this disynaptic feedforward configuration (Vogels et al., 201 1). 

BIOLOGICAL CORRELATES 

Plastic changes within biological memory systems are temporally 
dynamic phenomena, and arise as a result of biochemical cascades 
that are hierarchically coupled together at the molecular level. 
Despite this, and not least for reasons of computational conve- 
nience, phenomenological models of plasticity implicitly neglect 
both the contribution of the underlying biochemical pathways 
to the overarching computation along with their wide rang- 
ing timescales of operation. Furthermore, there is typically no 
explicit representation of memory age, thus rendering it impos- 
sible to take into account the relative familiarity of young or 
old memories. In contrast, our model explicitly implements the 
palimpsest property: three simple first order linear ordinary dif- 
ferential equations acting as temporally heterogeneous memory 
traces jointly serve the roles of assessing the novelty of the pre- 
sented pattern on-line and estimating the relative probabilities 
used to perform inference (Sandberg et al., 2002). 

The functional outcome of cascading memory traces at the 
synaptic level was a correlative Hebbian learning window with 
shape and relative width determined by r z ; and r z j. Preference for 
a left- or right-shifted temporal window has been shown in dif- 
ferent experimental preparations (Froemke and Dan, 2002; Testa- 
Silva et al., 2010), and it is thought that temporal asymmetry may 
be attributable to the differential induction of NMDA-mediated 
LTP (Abbott and Blum, 1996). Strong connections could develop 
between pools of neurons in a directionally specific manner 
(Abeles, 1991) during a training period of externally applied 
input (Sompolinsky and Kanter, 1986). Stored patterns could 
then be sequentially recalled forwards or backwards through time 
depending on whether r ZI - > r» or r ZI - < r z j. 

Associative learning typically leads to runaway excitation or 
quiescence in the network context. There are modifications of 
learning rules that maintain stability, such as STDP models 
with multiplicative dependence of the change in weight on the 
strength of the synapse (van Rossum et al., 2000), which produce 
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experimentally motivated unimodal equilibrium weight distribu- 
tions (Song et al., 2005). Competition between synapses can be 
achieved using terms that account for activity dependent scaling 
(van Rossum et al., 2000), intermediate STDP rule parameteri- 
zations (Giitig et al, 2003), or a tuned STDP rule to fit a long- 
tailed weight distribution (Gilson and Fukai, 2011). Spike-based 
BCPNN demonstrates coexisting competition and stability that 
emerge from the statistical assumptions accompanying Bayesian 
weight updating. Such alternatives are relevant given increasing 
questions surrounding the ubiquity (Abbott and Nelson, 2000), 
fidelity (Lisman and Spruston, 2010) and precision (Kempter and 
Gerstner, 2001; Babadi and Abbott, 2010) of asymmetrical STDP 
as a generic biological learning rule. 

One hypothesis for how stability can be achieved by neural cir- 
cuits is that Ca 2+ sensor pathways homeostatically regulate recep- 
tor trafficking to keep neuronal firing rates within a preferred 
regime (Rutherford et al, 1998; Turrigiano et al., 1998). Although 
spike-based BCPNN exhibited Hebbian synaptic plasticity, a reg- 
ulatory mechanism arose that was able to both stabilize network 
activity and preserve existing memories. Activity could remain 
stable despite correlation-based changes in synaptic strength, and 
weights could be scaled down in a competitive manner when 
subsets of neurons were potentiated (Figures 6, 7). Thus, rela- 
tive differences in synaptic efficacies could be preserved, similar 
to what is to be expected from synaptic scaling. This activity- 
dependent homeostatic mechanism is not unique to excitatory 
synapses. In spike-based BCPNN, negative Wy increased when 
pre- and postsynaptic neurons were weakly active (Figure 4), 
which was justified from a probabilistic point of view. Given the 
interpretation of negative weights (see Interpretation of Positive 
and Negative Synaptic Weights in the Model), similar behavior 
would be expected due to an antagonistic upregulation of activity 
as a result of inhibitory synaptic scaling targeting pyramidal cells 
(Kilman et al., 2002). 

Shared synaptic and nonsynaptic P traces in spike-based 
BCPNN suggest a novel probabilistic role for the integration of 
neural activity arising from molecular processes. Since the Pi trace 
appears in the computation of both and Wy, the model predicts 
coexpression of LTP/LTD and LTP-IE due to shared intracellu- 
lar postsynaptic Ca 2+ signaling cascades (Tsubokawa et al., 2000; 
Zhang and Linden, 2003). Indeed, LTP-IE is thought to share 
many common induction and expression pathways with LTP/LTD 
(Daoudal and Debanne, 2003), and experimental protocols used 
to study synaptic plasticity have often been shown to incidentally 
give rise to LTP-IE (Bliss and Lomo, 1973; Aizenman and Linden, 
2000; Daoudal et al, 2002). As in LTP/LTD, LTP-IE is rapidly 
induced and long-lasting (Aizenman and Linden, 2000; Cudmore 
and Turrigiano, 2004), consistent with the notion of r p . 

RELATED WORK 

Several previous approaches have represented probabilities 
explicitly or intermediately using measures of neural activity. 
Compelling models have been proposed based on probabilis- 
tic population coding (Ma et al., 2006), where the variability 
within a population response encodes uncertainty in the stimu- 
lus, and belief propagation (Rao, 2005; Litvak and Ullman, 2009; 
Steimer et al., 2009), in which relevant states are estimated using 



internodal communication of messages that are alternatingly 
summed and multiplied over factor graphs. Linking a proba- 
bilistic modeling approach with multiple synergistic biological 
processes has recently been emphasized. Coupled synaptic plas- 
ticity and synaptic scaling (Keck et al., 2012) along with coupled 
STDP and homeostatic intrinsic excitability (Nessler et al., 2013) 
have been proposed in the context of the expectation maxi- 
mization algorithm, whereas a model with coupled synaptic and 
intrinsic plasticity has been implemented using Gibbs sampling 
(Savin et al, 2014). This approach adopts a different machine 
learning-inspired algorithm, namely the naive Bayes classifier. 
Despite its underlying independence assumptions, Naive Bayes 
is known to perform surprisingly well in machine learning tasks 
compared with other advanced methods (Langley et al., 1992), 
and it is a subject of future work to develop biologically moti- 
vated benchmarks for these approaches in the domain of spiking 
neuronal networks. 

Spike-based BCPNN was not intended to phenomenologically 
describe neurophysiological results. Rather, these similarities arise 
naturally from theoretically and biologically constrained assump- 
tions. Learning in our model is based on three consecutively-fed 
traces that were temporally compatible with the signaling cas- 
cades of cellular processes underlying the induction of LTP and 
LTP-IE, and allowed each one to play a unique computational 
role during the online estimation of probabilities. Including mul- 
tiple time scales in an attempt to more accurately capture the wide 
variety of molecular processes involved in memory has also been 
argued for in previous models (Fusi et al., 2005; Clopath et al., 
2008). Another model hypothesized a memory scheme whereby 
LTP and LTP-IE could interact (Janowitz and van Rossum, 2006), 
but updates were asynchronous, which is difficult to recon- 
cile with the coordinated interdependence known from biology 
(Daoudal and Debanne, 2003) and shown here for spike-based 
BCPNN. 

Bayesian learning rules typically introduce rather specific 
assumptions about the makeup of activity or connectivity in 
the underlying neural circuit, and the one presented here intro- 
duces topological structure in the form of a WTA hypercol- 
umn microcircuit. As for our model, this has previously been 
achieved by lateral inhibition (Nessler et al, 2013). In others, sim- 
ilar conditions were fulfilled by homeostatic intrinsic excitability 
(Habenschuss et al, 2012) and feedforward inhibition (Keck et al., 
2012). Here, WTA normalizes outputs based on Equation 6 so 
that approximated posterior probabilities never exceed 1 within 
a hypercolumn. In biology, this normalization could be medi- 
ated by basket cell inhibition between local neural populations, 
a generic motif thought to be fundamental to cortical network 
organization (Douglas and Martin, 2004). 

In spike-based BCPNN, such local neural populations, i.e., 
minicolumns, represent stochastic computational units. The 
probability of an event is reflected by the probability that its cor- 
responding neurons spike during a given time step. Such consid- 
erations are advantageous from the perspective of neuromorphic 
hardware, in which Poisson-like noise and trial-to-trial variability 
physically manifest themselves as electronic phenomena. In the 
same vein, neural sampling (Buesing et al, 2011; Pecevski et al., 
2011) has been proposed in which relevant computational units 
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are not ensembles or columns of neurons but rather the stochas- 
tically firing neurons themselves. In both of these approaches, 
each spike carries a semantic interpretation. Several other mod- 
els also take this viewpoint for spikes, and moreover utilize these 
input spikes for learning (Deneve, 2008b; Nessler et al., 2013). 
In our model, the presence of a spike during a given time step 
signified an increase in confidence that the participating neurons 
are part of the presented pattern. The conductance-based neuron 
model we used is relatively detailed considering its alternatively 
proposed interpretation in terms of latent probabilistic opera- 
tions, although IAF dynamics have been exploited elsewhere in 
this context (Deneve, 2008a). 

Care was taken to ensure that extension of spike-based BCPNN 
did not deviate from previous abstract implementations (Lansner 
and Ekeberg, 1989; Lansner and Hoist, 1996). In doing so, the 
model here provides a direct way of exploring the spiking dynam- 
ics of systems in which BCPNN has been implicated, including 
neocortex (Sandberg et al., 2003; Johansson and Lansner, 2007; 
Lansner et al., 2013) and basal ganglia (Berthet et al., 2012). 
Such a step is necessary toward the goal of linking detailed neu- 
ral mechanisms with complex probabilistic computations. Our 
approach can naturally be extended to the recurrent setting using 
the attractor memory paradigm, considered one of the most pow- 
erful tools for describing non-linear network dynamics (Lansner, 
2009) yet notably absent thus far in the context of spiking models 
that incorporate probabilistic learning and inference. 

In summary, we have described how a simple microcir- 
cuit comprised of intrinsically excitable conductance-based IAF 
neurons, interconnected by synapses endowed with correla- 
tive weight-dependent Hebbian-Bayesian plasticity, could readily 
approximate Bayesian computation. Spike-based BCPNN pro- 
poses a novel way of linking biochemical processes at the sub- 
cellular level and Poisson-like variability at the neuron level with 
complex probabilistic computations at the microcircuit level. 
It implies that the presence of a spike, or lack thereof, not 
only enacts measurable changes in the biochemical makeup of 
synapses and cells, but moreover contributes to an underlying, 
ongoing inference process. 
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